How-To Guides
Backend Local Setup
To run the backend, first set up a Python virtual environment (e.g. conda or venv) and install the dependencies.
With venv:
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
To execute the code generated by the science agent, you need Docker installed on your system. On Linux, Docker requires root access by default, so you can either add your user to the docker group, run the backend with root privileges, or configure Docker in rootless mode.
To build the Docker image for running generated programs, run the following commands:
cd sci_agent_docker
docker build -t science-agent .
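If the backend later fails to launch generated programs, it can help to confirm that your user can reach the Docker daemon and that the image was built. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the project, and the default tag is assumed to match the one used in the build command above.

```python
import shutil
import subprocess


def docker_usable(timeout: float = 10.0) -> bool:
    """Return True if the current user can talk to the Docker daemon."""
    if shutil.which("docker") is None:
        return False  # the Docker CLI is not on PATH
    try:
        # `docker info` exits non-zero when the daemon is unreachable,
        # for example when the user lacks permission on the Docker socket.
        result = subprocess.run(
            ["docker", "info"], capture_output=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def image_exists(tag: str = "science-agent", timeout: float = 10.0) -> bool:
    """Return True if a local image with the given tag exists."""
    if shutil.which("docker") is None:
        return False
    try:
        # `docker image inspect` exits non-zero for an unknown image.
        result = subprocess.run(
            ["docker", "image", "inspect", tag],
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

If `docker_usable()` returns False on Linux while Docker is installed, the permission options described above (docker group, root, or rootless mode) are the likely place to look.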
The backend makes use of AWS Bedrock, S3, and DynamoDB and requires AWS credentials even when running locally. Set up your AWS configuration (~/.aws/config) with at least the following properties:
[default]
aws_access_key_id = {YOUR_AWS_ID}
aws_secret_access_key = {YOUR_AWS_KEY}
Alternatively, use the AWS CLI to set up your local config, e.g.:
aws configure
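To check that the configuration file contains the two properties listed above, you could use a small standard-library sketch like the one below. The function name is illustrative. It only inspects the file you pass in, so it will report keys as missing if your credentials are supplied another way (for example through environment variables).

```python
import configparser
from pathlib import Path

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_aws_keys(config_path: Path, profile: str = "default") -> list[str]:
    """Return the required keys that are absent or empty in the profile."""
    # Interpolation is disabled so that a "%" in a value is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is silently ignored
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

For example, `missing_aws_keys(Path.home() / ".aws" / "config")` returns an empty list when both properties are set in the `[default]` section.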
You will need to create an S3 bucket and a DynamoDB table, or use services that are API-compatible. Modify the backend/config.py file to set the AWS region, S3 bucket name, and other settings as needed. To use a different LLM provider, modify the LLM_ENGINE_NAME variable. Currently, the OpenAI, Amazon Bedrock, and Google Gemini APIs are supported. If you do not use Amazon Bedrock, you will need to set the LLM_API_KEY environment variable to the appropriate API key.
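The rule for LLM_API_KEY can be sketched as a small startup check. This is an illustration only, not the project's actual code: the function name is hypothetical, and it assumes that Bedrock engine names contain the substring "bedrock", which may not match the values that backend/config.py really accepts.

```python
import os


def resolve_llm_api_key(engine_name: str, environ=os.environ):
    """Return the API key for non-Bedrock engines, or None for Bedrock."""
    # Assumption: Bedrock engine names contain "bedrock". Check
    # backend/config.py for the values LLM_ENGINE_NAME actually takes.
    if "bedrock" in engine_name.lower():
        return None  # Bedrock authenticates with AWS credentials instead
    key = environ.get("LLM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LLM_API_KEY must be set when not using Amazon Bedrock"
        )
    return key
```

Failing early like this surfaces a missing key when the backend starts rather than on the first LLM request.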
The backend uses Quart, an asynchronous Python web framework. To start the dev server, run the following:
cd backend
quart run
Frontend Local Setup
To run locally, first make sure Node and NPM are installed. Then run the following commands in your terminal to install the necessary dependencies:
cd frontend
npm install
Open .env.development in the frontend directory and configure the following property:
VITE_STATIC_FILE_BASE_URL=https://{your-s3-bucket-name}.s3.amazonaws.com
To start the frontend, run the following command. The frontend will then be available at http://localhost:5173:
npm run dev
Deployment
To deploy the backend, follow the local setup instructions above, but use a production-ready ASGI server such as hypercorn instead of the Quart dev server. See the official Quart documentation for more details on deploying Quart applications.
To deploy the frontend, open .env.production in the frontend directory and configure the following properties (using the URL or server IP address where your backend is hosted):
VITE_API_BASE_URL=https://{your-backend-url}
VITE_STATIC_FILE_BASE_URL=https://{your-s3-bucket-name}.s3.amazonaws.com
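A common mistake is to build with the `{...}` placeholders still in place. The sketch below, written with the Python standard library, flags required keys that are missing or still contain a placeholder. The helper names are illustrative and not part of the project, and the parser only understands plain KEY=VALUE lines, which is a simplification of what env files may contain.

```python
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def env_problems(path: Path, required: tuple[str, ...]) -> list[str]:
    """List required keys that are missing or still hold a {placeholder}."""
    values = parse_env_file(path)
    problems = []
    for key in required:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif "{" in value or "}" in value:
            problems.append(f"{key} still contains a placeholder")
    return problems
```

For the production file you might call `env_problems(Path("frontend/.env.production"), ("VITE_API_BASE_URL", "VITE_STATIC_FILE_BASE_URL"))` and only build when the returned list is empty.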
Next, build the production version of the frontend by running:
npm run build
This will create a dist directory containing the production build of the frontend.
To serve the production build, host the files in the dist directory with any static file server.
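For a quick local check of the build before setting up a real static file server, Python's built-in http.server module is enough. The sketch below is an assumption-laden convenience, not a deployment recommendation: http.server is not intended for production use, the function name and default port are illustrative, and it does no fallback routing for client-side routes.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(directory: str, port: int = 8080) -> ThreadingHTTPServer:
    """Create a server that serves files from `directory` on localhost."""
    # Bind the handler to the build directory instead of the current one.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#     make_static_server("frontend/dist").serve_forever()
```

With the default port, the build would then be reachable at http://127.0.0.1:8080 on the same machine.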
Example Tasks
To load all the tasks from the ScienceAgentBench dataset and make them available as example tasks, download the dataset into the backend/benchmark directory.
Note that you need to download the full benchmark.zip using the link in the repository README file.
Once you have downloaded and extracted the data, run the following script:
cd backend
python preload_benchmark.py
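If the script does not find any tasks, a first thing to check is whether the extracted data is where it is expected. The sketch below only verifies that the directory exists and is not empty; the function name is illustrative, and the internal layout that preload_benchmark.py expects is not assumed here.

```python
from pathlib import Path


def benchmark_ready(benchmark_dir: Path) -> bool:
    """Return True if the directory exists and contains at least one entry."""
    return benchmark_dir.is_dir() and any(benchmark_dir.iterdir())
```

From the repository root, `benchmark_ready(Path("backend/benchmark"))` should return True after the archive has been extracted.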